Chart Translation
Abstract
For efficiency reasons, Machine Translation systems are generally designed to eliminate ambiguities as early as possible even if delaying the decision would make a more informed choice possible. This paper takes the contrary view, arguing that essentially all choices should be deferred so that large numbers of competing translations will be produced in typical cases. Representing all the data structures in a suitable packed form, much as alternative structures are represented in a chart parser, makes this practicable.

1 Translation and Knowledge

Judging by the great increase in activity in machine translation in the last few years, an outside observer might easily conclude that researchers in the field had finally reached the goal of truly practical translation systems towards which they have been striving for some forty years. Maybe some breakthrough has occurred, or maybe all the little incremental efforts have finally proved just enough to push us over an important invisible line. Insiders know differently. If we somehow seem to be winning the game that has been going against us for so long, it is not because we have learnt to play better; it is because the goal posts have been moved. Simply stated, the market for low-quality machine translation has grown from nothing to one clearly worthy of commercial interest in a matter of two or three years.

Whatever the reasons may be for this change in the translation market, the World Wide Web surely played an important part. Browsers now routinely offer happy explorers the opportunity to have the results of their quests translated into their own language at the click of a mouse. Their expectations of this process are, however, no greater than they were of the initial search. Web usage is essentially casual, even when a lot could turn on the outcome. If our search comes up with nothing, at least little time will have been lost.
If we find something useful, it will be frosting on the cake, something we knew we had no real right to expect. So it is with a translation that is offered. If by reading it fast with little attention to detail, we seem to perceive something in it that touches on the subject of our quest, at best we may gain some useful information; at worst we will be amused.

There will continue to be a market for this kind of translation until substantially better results can be produced with little or no increase in the price. But there is also a great and growing need for high-quality translation and, as I have argued repeatedly (Kay et al. 1994, Kay 1997), computers will do little to help fill this need until very large strides have been made towards building programs that could pass the Turing test: programs that could, in other words, successfully masquerade as human beings on the other end of a telephone or computer-mediated connection.

My claim is that a substantial proportion of the linguistic problems that need to be solved in order to achieve high-quality translation are already fairly well in hand but that linguistics is a relatively minor part of what is required for translation. The remaining problems are not confined to any particular field of endeavor. Anything that a person might know, or believe, or suspect, or impute to the knowledge, belief or suspicions of another, could be crucial for translating the next sentence in some text. Furthermore, the sentences for which a good translation is possible only in the light of such nonlinguistic, unsystematic knowledge are the rule rather than the exception. In short, translation is what is sometimes called an AI-complete problem.

My conclusion from this has been, and continues to be, not that computers are out of place in high-quality translation, but simply that they cannot be expected to do the job alone. The human contribution to the enterprise is indispensable.
However, it absolutely need not take the form of actually making the translation and, indeed, substantial increases in the quality of current fully automatic systems may be possible with contributions by humans who know only a single language. A person who knew the source language and the subject matter of the material to be translated might be called upon to answer questions about the meanings of particular words and phrases or the referents of pronouns and definite noun phrases. If the questions are chosen with care, and the answers interpreted at a sufficiently high level of abstraction, then they may contribute to translations into several different target languages. This is important in view of the observation that, while most documents are not translated at all, those that are translated usually go into several different languages. Once beyond the narrow realm of meteorological reports, humans are always involved in translation when-
Similar resources
Chart-Based Transfer Rule Application in Machine Translation
Transfer-based Machine Translation systems require a procedure for choosing the set of transfer rules for generating a target language translation from a given source language sentence. In an MT system with many competing transfer rules, choosing the best set of transfer rules for translation may involve the evaluation of an explosive number of competing sets. We propose a solution to this prob...
A Chart Generator for Shake and Bake Machine Translation
A generation algorithm based on an active chart parsing algorithm is introduced which can be used in conjunction with a Shake and Bake machine translation system. A concise Prolog implementation of the algorithm is provided, and some performance comparisons with a shift-reduce based algorithm are given which show the chart generator is much more efficient for generating all possible sentences f...
Lattice Parsing to Integrate Speech Recognition and Rule-Based Machine Translation
In this paper, we present a novel approach to integrate speech recognition and rule-based machine translation by lattice parsing. The presented approach is hybrid in two senses. First, it combines structural and statistical methods for the language modeling task. Second, it employs a chart parser which utilizes manually created syntax rules in addition to scores obtained after statistical processing...
Integrating Phrase-based Reordering Features into a Chart-based Decoder for Machine Translation
Hiero translation models have two limitations compared to phrase-based models: 1) Limited hypothesis space; 2) No lexicalized reordering model. We propose an extension of Hiero called Phrasal-Hiero to address Hiero’s second problem. Phrasal-Hiero still has the same hypothesis space as the original Hiero but incorporates a phrase-based distance cost feature and lexicalized reordering features into...
Chart-based Incremental Transfer in Machine Translation
The transfer stage of a machine translation system for spontaneously spoken language in any case has to work incrementally and time-synchronously to be acceptable within natural dialogue settings. To achieve some of the necessary properties, we start from data structures and algorithms as known from chart parsing. Techniques used in this framework for analysis can be applied to the transfer sta...
A CYK+ Variant for SCFG Decoding Without a Dot Chart
While CYK+ and Earley-style variants are popular algorithms for decoding unbinarized SCFGs, in particular for syntax-based Statistical Machine Translation, the algorithms rely on a so-called dot chart which suffers from high memory consumption. We propose a recursive variant of the CYK+ algorithm that eliminates the dot chart, without incurring an increase in time complexity for SCFG decoding....